INTRODUCTION

From the standpoint of population genetics, one of the most
elementary steps in evolution is the change in gene frequency,
especially the change due to natural selection. Since various
factors introduce an element of indeterminacy into the process,
it is not difficult to imagine that the process is continuous.
One of these factors, random sampling of gametes due to finite
population size, is of special interest. There are also
systematic pressures that affect gene frequency. Among these
are selection, migration, and mutation. The change due to
selection is controlled by the amount of selection, or selection
intensity. It is also found that these selection intensities
fluctuate from generation to generation. These two points of
interest, random sampling of gametes and fluctuation of
selection intensities, cause a phenomenon known as genetic drift.

Genetic drift due to random sampling of gametes will cause the
gene in question to become either completely fixed or completely
lost from the population and will approach one of these limits
asymptotically. In reaching one or the other of these limits,
the gene frequency varies as a stochastic process (see Fig. 1).
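The kind of trajectory described above can be sketched with a short simulation. This is a minimal illustration only, assuming a population of N diploid parents in which each of the 2N gametes of the next generation is drawn independently with the current gene frequency; the function name `simulate_drift` and the parameter values are hypothetical and not taken from the source.

```python
import random


def simulate_drift(n_individuals, p0, generations, seed=None):
    """Gene-frequency path under random sampling of 2N gametes per generation."""
    rng = random.Random(seed)
    two_n = 2 * n_individuals
    freq = p0
    path = [freq]
    for _ in range(generations):
        # Each of the 2N gametes carries the gene with probability `freq`.
        copies = sum(1 for _ in range(two_n) if rng.random() < freq)
        freq = copies / two_n
        path.append(freq)
    return path


# Three replicate populations starting from gene frequency 0.5 (assumed values).
paths = [simulate_drift(10, 0.5, 200, seed=s) for s in range(3)]
for path in paths:
    print(path[-1])
```

Once a path reaches 0 or 1 it stays there, which is the fixation or loss described in the text.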

Genetic drift due to fluctuation of selection intensities also
approaches either fixation or loss asymptotically. But for this
case the gene frequency will become fixed before it reaches
complete homozygosity (see Fig. 2). Thus if we have a pair of
alleles $A_1$ and $A_2$ and genotypes $A_1A_1$, $A_1A_2$, and
$A_2A_2$, after a certain number of generations, the genotypes
will become fixed as $A_1A_1$ and $A_2A_2$. On the other hand,
drift due to random sampling of gametes will produce all
$A_1A_1$ or all $A_2A_2$. In both cases all heterozygotes will
be lost.

Fig. 1. Three examples of genetic drift due to random sampling
of gametes in finite populations. Original gene frequency is .5.
(Horizontal axis: generations.)

Fig. 2. Example of genetic drift due to fluctuation of selection
intensity, reaching fixation at about .9 after k generations.

Mathematical  treatments  will  be  presented  for  these  cases 
of  genetic  drift. 

HISTORICAL  BACKGROUND 

Hagedoorn  Effect 

Some of the first mathematics dealing with genetic drift
appeared in 1921. Fisher (1921) proposed the following: If p is
the proportion of any gene, and q is the frequency of its
allelomorph, then in N individuals of any generation we have
2Np genes scattered at random. Let $\cos\theta = 1 - 2p$, where
$0 < \theta < \pi$. For a second generation of N individuals
formed at random, the standard deviation of p will be

$$\sigma_p = \sqrt{\frac{pq}{2N}},$$

thus

$$\sigma_\theta = \frac{1}{\sqrt{2N}}.$$

Since this is independent of $\theta$, Fisher calculated the
changes in the distribution of $\theta$, in the absence of
selection. If $y(\theta)\,d\theta$ represents the distribution
of $\theta$ in any one generation, the distribution in the next
generation will be

where $\sigma_\theta^2 = 1/2N$ is very small. The rate of change
of $y(\theta)$ is given by

Since no distinction has been drawn between the gene and its
allelomorph, the above solutions are symmetric. The stationary
case is $y = A/\pi$, where A is the number of factors present
(unfixed loci).

Fisher explains that when y is increasing,

and when y is decreasing,

$$y = A_0\,e^{-kT}\cos p\left(\theta - \tfrac{\pi}{2}\right), \qquad (4)$$

where

$$k = \frac{p^2}{4N}.$$

1  2 

2  sin  -  pit 
2 


(5) 


Gene Extinction Due to Drift. Fisher (1921) represents by
$e^{-h}$ the chance that a particular gene borne by a single
individual will not be represented in the next generation. The
chance of extinction for a factor of which b genes are in
existence will be $e^{-bh}$. When $\theta$ is near zero, p,
which is always equal to $\sin^2\tfrac{1}{2}\theta$, will be
nearly equal to $\tfrac{1}{4}\theta^2$. Let
$t = \sin\tfrac{1}{2}\theta$; then the number of genes in
existence is $2Nt^2$ and the chance of their extinction in one
generation is $e^{-2Nht^2}$. This chance is negligible except
when t is very small and may be equated to
$\tfrac{1}{2}\theta$; hence the number of genes exterminated in
any one generation is

$$2\int y\,e^{-2Nht^2}\,d\theta = 4\int y\,e^{-2Nht^2}\,dt. \qquad (6)$$
In the stationary case, $y = A/\pi$ and the number of genes
exterminated will be

A   2  J2% 

=   A 
if new mutations occur at rate $N\mu$, then gene frequency
equilibrium will occur at

$$A = \sqrt{\pi/2}\;\mu N^{3/2}.$$


In the absence of mutation, there is extinction, and the number
of factors diminishes. Considering equation (4) when $\theta$
is small, one gets

so the rate of extinction is:

Equating this to (5) gives

when $\cot\tfrac{1}{2}p\pi$ is of the order of $1/\sqrt{N}$, so
that p is near 1, $k = 1/4N$.

Hagedoorn (1921) was one of the first to indicate this random
effect, and Fisher named it "The Hagedoorn Effect". Fisher's
value of 1/4N was later disproved by Wright, as will be
discussed in the following section.


Sewall  Wright  Effect 

In a paper in 1921, Wright gave a general method for determining
the decrease in heterozygosis. He stated that for two alleles
per locus the rate of loss per generation is 1/2N in the case of
a breeding population of N individuals either equally divided
between males and females or composed of monoecious individuals.
This differs from the result given by Fisher above, as will be
explained later in this section.

Wright expanded on the subject in 1931 and gave these results.

Consider a population in which there are $N_m$ breeding males
and $N_f$ breeding females, and random mating. The proportion
of matings between full brother and sister will be
$1/(N_mN_f)$, that between half brother and sister
$(N_m + N_f - 2)/(N_mN_f)$, and that between less closely
related individuals $(N_m - 1)(N_f - 1)/(N_mN_f)$. The
correlation between mated individuals may be written, giving
due weight to these three possibilities.

,o19r  1  Nm  +  Nf  -  2 
where $a = \sqrt{\dfrac{1}{2(1+F)}}$ is the path coefficient,
gamete to fertilized egg, $b = \sqrt{\tfrac{1}{2}(1+F')}$ is the
path coefficient, zygote to gamete, and F is the correlation
between uniting egg and sperm, and where primes are used to
indicate the number of generations preceding the one in
question. The proportional change in heterozygosis is given by:

$$F = F' + \frac{N_m + N_f}{8N_mN_f}\,(1 - 2F' + F'').$$

The proportion of heterozygosis, $P = 1 - F$, is

$$P = P' - \frac{N_m + N_f}{8N_mN_f}\,(2P' - P'').$$

It is to be expected that the proportional change per
generation will reach approximate constancy. This rate was
found by equating P/P' to P'/P'' to be:

$$-\frac{\Delta P}{P'} = \frac{1}{2}\left[1 + \frac{N_m + N_f}{4N_mN_f} - \sqrt{1 + \left(\frac{N_m + N_f}{4N_mN_f}\right)^2}\,\right].$$

This gives for small populations

$$\left(\frac{1}{8N_m} + \frac{1}{8N_f}\right)\left(1 - \frac{1}{8N_m} - \frac{1}{8N_f}\right)$$

as a close approximation, and for large populations

$$\frac{1}{8N_m} + \frac{1}{8N_f}.$$

For the simplest case of mating brother with sister, or
$N_m = N_f = 1$, the rate of loss of heterozygosis is
$\tfrac{1}{4}(3 - \sqrt{5})$, or 19.1 per cent per generation.
For the case $N_m = 1$ and $N_f = \infty$, the rate of loss is
about 11 per cent per generation. For a more useful case in
which there is a relatively limited number of males but an
unlimited number of females, the rate becomes $1/8N_m$. An
especially important case is the population which is equally
divided, or $N_m = N_f = \tfrac{1}{2}N$. In this case the rate
is 1/(2N + 1), or approximately 1/2N.
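The numerical cases quoted above can be checked with a short sketch. It assumes the recurrence P = P' - c(2P' - P'') with c = (Nm + Nf)/(8 Nm Nf) and takes the steady ratio P/P' as the larger root of the resulting quadratic; the function names are hypothetical.

```python
import math


def loss_rate(nm, nf):
    """Steady rate of loss of heterozygosis, 1 - P/P', from the recurrence."""
    c = (nm + nf) / (8.0 * nm * nf)
    # P/P' = P'/P'' = lam gives lam**2 - (1 - 2c)*lam - c = 0; take the larger root.
    lam = ((1 - 2 * c) + math.sqrt(1 + 4 * c * c)) / 2
    return 1 - lam


def loss_rate_approx(nm, nf):
    """Close approximation c(1 - c) with c = 1/(8 Nm) + 1/(8 Nf)."""
    c = 1 / (8.0 * nm) + 1 / (8.0 * nf)
    return c * (1 - c)


print(loss_rate(1, 1))          # brother-sister mating
print(loss_rate(1, 10 ** 9))    # one male, effectively unlimited females
print(loss_rate(50, 50), 1 / 201)
```

The very large value of Nf stands in for an unlimited number of females.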

If only random mating cases are considered, then gametes have a
chance 1/N of coming from the same individual and (N - 1)/N of
coming from different individuals. The correlation between
uniting gametes may then be written

$$F = \frac{1}{N}\,b^2 + \frac{N-1}{N}\,F'$$

and
$$P = P'\,\frac{2N - 1}{2N}.$$

This result does not differ appreciably from that of the
preceding case. The rate of loss is exactly 1/2N instead of
1/(2N + 1).

The simplest case is continued self-fertilization in which
N = 1 and the formula gives 50 per cent loss per generation,
as would be expected.

In order to determine generally the distribution of gene
frequencies, Wright (1931) considers the way in which genes
$A_1$ with frequency q are distributed after one generation of
random mating. In a population of N breeding individuals, each
of the specified genes will have 2Nq representatives among the
zygotes and their allelomorphs 2N(1 - q). A random sample of
the same size will be distributed according to the expression

$$\left[(1 - q)A_2 + qA_1\right]^{2N}. \qquad (8)$$

The contribution of this sample to the frequency class with an
allelomorphic ratio $q_1 : (1 - q_1)$ will be in proportion to
the $2Nq_1$th term of the above expression and to the number of
genes included in the contributing class (f). The sum of
contributions from all such classes should give the $2Nq_1$th
term an absolute frequency which is smaller than its value in
the preceding generation ($f_1$) by the proportion 1/(2N + 1),
as given above. Thus the following equation is given to solve
for f as a function of q.

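$$\left(1 - \frac{1}{2N+1}\right)f_1 = \frac{(2N)!}{(2Nq_1)!\,(2N(1-q_1))!}\sum q^{2Nq_1}(1-q)^{2N(1-q_1)}\,f.$$

Let $f = \phi(q)/2N = \phi(q)\,dq$, and replacing summation by
integration, the result is:

$$\frac{\phi(q_1)}{2N+1} = \frac{(2N)!}{(2Nq_1)!\,(2N(1-q_1))!}\int_0^1 q^{2Nq_1}(1-q)^{2N(1-q_1)}\,\phi(q)\,dq. \qquad (9)$$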
The cases of 2 and 3 monoecious individuals as worked out by
simple algebra suggest an approach to a uniform distribution.
As a trial, let $\phi(q) = C$. This is a solution since

$$\frac{C}{2N+1} = \frac{C\,(2N)!}{(2Nq_1)!\,(2N(1-q_1))!}\cdot\frac{\Gamma(2Nq_1+1)\,\Gamma(2N-2Nq_1+1)}{\Gamma(2N+2)}.$$

It would appear that after a cross mating the gene frequencies
will spread out from 50 per cent toward fixation or loss until
a practically uniform distribution is reached. The frequencies
of all classes will then decrease at a rate of about 1/2N as
1/4N of the genes become fixed and 1/4N become lost per
generation if q = 1/2 initially.
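The claim that a uniform density reproduces itself rests on a Beta integral, which can be verified exactly for small cases. The sketch below is only an illustration; `uniform_contribution` is a hypothetical name, and the integral is evaluated from the closed form k!(2N-k)!/(2N+1)! rather than numerically.

```python
from fractions import Fraction
from math import comb, factorial


def uniform_contribution(two_n, k):
    """Integral over q in (0, 1) of C(2N, k) q^k (1 - q)^(2N - k), as an exact fraction."""
    # Beta integral: integral of q^k (1-q)^(2N-k) dq = k! (2N-k)! / (2N+1)!
    beta = Fraction(factorial(k) * factorial(two_n - k), factorial(two_n + 1))
    return comb(two_n, k) * beta


two_n = 20  # assumed example: N = 10 individuals
values = [uniform_contribution(two_n, k) for k in range(two_n + 1)]
print(values[0], values[7], values[two_n])
```

Every class receives the same contribution 1/(2N + 1), independent of k, which is what makes the constant density a solution.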

Wright (1931) points out that we must examine the terminal
points before fully accepting this solution. The amount of
fixation at the extremes, if N is large, can be found directly
from the Poisson series. The contribution to the zero class
when the mean number in the sample is m (m = 1, 2, 3, ...) is:

$$(e^{-1} + e^{-2} + e^{-3} + \cdots)\,f = \frac{e^{-1}}{1 - e^{-1}}\,f = .582\,f.$$

This is larger than $\tfrac{1}{2}f$ as stated above, but is
attributed to the distortion near the ends due to the element
of approximation involved in using integration for summation.

If mutation is occurring, however low the rate, the decline in
heterozygosis cannot go on indefinitely. There will come a time
when the chance elimination of genes will be exactly balanced
by new genes arising by mutation. The equation to be solved is
equation (9). By trial and error, Wright (1931) finds
$\phi(q) = C_1q^{-1} + C_2(1-q)^{-1}$ as a solution. The
terminal condition, reduction of the class of fixed genes by an
occasional mutation contributing to the class
q = (2N - 1)/2N, necessarily involves the appearance of new
genes contributing to the class q = 1/2N. This means that only
the symmetrical solution $\phi(q) = Cq^{-1}(1-q)^{-1}$ can be
accepted as descriptive of the distribution of the entire array
of genes at equilibrium, provided there is no selection,
migration, or recurrence of the same mutation. Thus letting

$$f = \frac{C}{2N}\,q^{-1}(1-q)^{-1} \quad\text{and}\quad \sum f = 1,$$

$$C = \frac{1}{2\left[.577 + \log(2N-1)\right]} \approx \frac{1}{2\log 3.6N}. \qquad (10)$$
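The normalizing constant in (10) can be compared with its logarithmic approximations by direct summation over the unfixed classes q = k/2N, k = 1, ..., 2N - 1. This is a sketch under that assumption; `wright_c` is a hypothetical helper name and natural logarithms are assumed.

```python
import math


def wright_c(n_individuals):
    """C such that the classes f = (C/2N) / (q(1-q)), q = k/2N, sum to 1."""
    two_n = 2 * n_individuals
    # 1/(q(1-q)) = (2N)^2 / (k(2N-k)), so sum f = C * 2N * sum 1/(k(2N-k)).
    total = sum(1.0 / (k * (two_n - k)) for k in range(1, two_n))
    return 1.0 / (two_n * total)


n = 1000
exact = wright_c(n)
approx_euler = 1.0 / (2.0 * (0.5772 + math.log(2 * n - 1)))
approx_log = 1.0 / (2.0 * math.log(3.6 * n))
print(exact, approx_euler, approx_log)
```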


Before attainment of equilibrium with respect to heterozygosis
the distribution will pass through phases of approximately the
form $\phi(q) = C_1q^{-1}(1-q)^{-1} + C_0$, in which the term
$C_1$ gradually displaces $C_0$ as the number of temporarily
fixed genes approaches equilibrium with mutation. As the chance
of complete fixation increases, the chance of mutation must be
taken into account. The distribution passes through phases of
the type $C_2(1-q)^{-1} + C_0$, $C_2$ gradually displacing
$C_0$, relatively, but itself declining as the chance of
complete loss increases.
If there is reverse mutation, but at a very slow rate, a term
$C_1q^{-1}$ must be added to the formula, and an equilibrium
will be reached in the form $Cq^{-1}(1-q)^{-1}$. Thus in the
long run, the gene will be completely fixed or completely
absent from the population with frequencies proportioned to the
mutation rates to and from the gene respectively. Occasionally
these conditions will not be quite complete and at extremely
rare intervals the gene will drift from one state to the other.

The turnover among genes in equilibrium in the distribution
$Cq^{-1}(1-q)^{-1}$ can be determined by consideration of the
variance of q and independently by application of the Poisson
law. Let

$$\sigma_q^2 = \frac{\sum (q - \tfrac{1}{2})^2 f}{\sum f}$$

be the variance of q, excluding the terminal classes. This
variance is increased in the following generation by the
spreading out of each frequency class as a result of random
sampling. The variance from the spreading of a single class is
q(1 - q)/2N and the average is thus

$$\Delta\sigma_q^2 = \frac{\sum q(1-q)f}{2N\sum f} = \frac{1}{2N}\left(\frac{1}{4} - \sigma_q^2\right) = C\,\frac{2N-1}{(2N)^2},$$

where C is as in (10). The sum $\sigma_q^2 + \Delta\sigma_q^2$
includes the newly fixed factors whose contribution is
$\tfrac{1}{4}k$, where k is the rate of fixation, plus loss,
but excludes mutation. The contribution of the new mutations to
the variance is $k(N-1)^2/(2N)^2$; therefore

$$\sigma_q^2 + \Delta\sigma_q^2 - \frac{1}{4}k + k\,\frac{(N-1)^2}{(2N)^2} = \sigma_q^2,$$

$$k = C \approx \frac{1}{2\log 3.6N}.$$

The proportion exchanged at each extreme is thus about half the
corresponding subterminal class when N is large. This compares
with the proportion as determined by the Poisson law, which is
.46 times the subterminal class instead of .50.

Referring to Fisher's equations (1) and (2), Wright made the
following remarks. He claims that equation (2) gives the wrong
solution, and he also points out a comparison of the equations.
He states that in a breeding population of one million with one
mutation per 1000 individuals, Fisher's formula,
$\sqrt{\pi/2}\,N^{3/2}\mu$, gives 1,250,000 unfixed factors
with a turnover of .08 per cent, while his own formula,
$2N\mu\log 3.6N$, gives 30,000 unfixed factors and a turnover
of 3.3 per cent.

Fisher yielded to Wright, and Wright (1931) printed a note from
Fisher to this effect. Equation (2) should have read

$$\frac{\partial y}{\partial T} = \frac{1}{4N}\frac{\partial}{\partial\theta}(y\cot\theta) + \frac{1}{4N}\frac{\partial^2 y}{\partial\theta^2}$$

and with this he agrees with Wright's value of 1/2N.
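The figures in Wright's comparison can be reproduced with a few lines of arithmetic. The sketch assumes that "one mutation per 1000 individuals" means a rate of 0.001 per individual, that logarithms are natural, and that turnover is the number of new mutations per generation divided by the number of unfixed factors; these readings are assumptions, not statements from the source.

```python
import math

N = 1_000_000          # breeding population size
mu = 1.0 / 1000        # assumed mutation rate per individual
new_mutations = N * mu

fisher_factors = math.sqrt(math.pi / 2) * N ** 1.5 * mu
wright_factors = 2 * N * mu * math.log(3.6 * N)

# Turnover in per cent, taken here as new mutations / unfixed factors.
fisher_turnover = 100 * new_mutations / fisher_factors
wright_turnover = 100 * new_mutations / wright_factors

print(round(fisher_factors), round(fisher_turnover, 2))
print(round(wright_factors), round(wright_turnover, 2))
```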
MODERN APPROACH TO GENETIC DRIFT

Kimura's Treatment of Random Genetic Drift Due
to Random Sampling of Gametes

As was noted before, Fisher (1921) and Wright (1931) gave
solutions to this problem. Fisher used differential equations
and Wright used differential equations and path coefficients.

Kimura (1955c) states that in these works it was assumed that a
state of steady decay had been reached. Nothing was known about
the complete solution which might show how the process finally
leads to the state of steady decay. Kimura showed that the
process approaches the state of steady decay asymptotically by
finding the moments of the distribution and using the
Fokker-Planck equation.

Again considering a finite random mating population of N
diploid parents, where $A_1$ and $A_2$ are a pair of alleles
with frequencies x and 1 - x, respectively, when there is no
selection, mutation, or migration, Kimura (1955c) states that
an adequate description of the change in gene frequencies is to
give the frequency distribution f(x, t) at the tth generation,
where x takes on a series of discrete values: 0, 1/2N, 2/2N,
..., 1 - 1/2N, 1. Without serious error, x can be considered
continuous for large N.

First of all, Kimura (1954) gave as the nth moment of the
distribution about zero:

$$\begin{aligned}
\mu_n'^{(t)} = p &- 3pq\,\frac{n-1}{n+1}\,(1-\lambda_1)^t
 - 5pq(p-q)\,\frac{(n-2)(n-1)}{(n+1)(n+2)}\,(1-\lambda_2)^t \\
&- 7pq(-5pq+1)\,\frac{(n-3)(n-2)(n-1)}{(n+1)(n+2)(n+3)}\,(1-\lambda_3)^t \\
&- 9pq(14pq^2 - 7pq + p - q)\,\frac{(n-4)(n-3)(n-2)(n-1)}{(n+1)(n+2)(n+3)(n+4)}\,(1-\lambda_4)^t + \cdots \qquad (11)
\end{aligned}$$

where $q = 1 - p$ and $\lambda_i = \dfrac{i(i+1)}{4N}$,
i = 1, 2, ...

Using a more sophisticated method, Kimura (1955a) presented the
following: Let $x_t$ be the gene frequency in the tth
generation, and let $\delta x_t$ be the amount of change due to
random sampling of gametes per generation, such that

$$x_{t+1} = x_t + \delta x_t. \qquad (12)$$

Let $\mu_n'^{(t+1)} = E(x_{t+1}^n)$ be the nth moment of the
distribution about zero in the (t + 1)th generation. He then
writes $E(x_{t+1}^n)$ in terms of $(x_t + \delta x_t)^n$. This
is done in two steps: first, taking the expectation for the
random change $\delta x_t$, which will be denoted by
$E_\delta$, and, second, taking the expectation for the
existing distribution, denoted by $E_\phi$.

Note that $E_\delta(\delta x_t) = 0$,
$E_\delta(\delta x_t)^2 = x_t(1 - x_t)/2N$, etc., so
$$\begin{aligned}
\mu_n'^{(t+1)} &= E(x_t + \delta x_t)^n \\
&= E_\phi\left[x_t^n + \binom{n}{1}x_t^{n-1}E_\delta(\delta x_t) + \binom{n}{2}x_t^{n-2}E_\delta(\delta x_t)^2 + \cdots\right] \\
&= E_\phi\left[x_t^n + \frac{n(n-1)}{2}\,x_t^{n-2}\,\frac{x_t(1-x_t)}{2N} + \cdots\right], \qquad (13)
\end{aligned}$$

assuming that N is large enough so that terms of order
$1/N^2$ and higher can be omitted without serious error. The
equation is then:

$$\mu_n'^{(t+1)} = \left[1 - \frac{n(n-1)}{4N}\right]\mu_n'^{(t)} + \frac{n(n-1)}{4N}\,\mu_{n-1}'^{(t)}. \qquad (14)$$

For large N the moments change very slowly so equation (14) is
replaced by the system of differential equations

$$\frac{d\mu_n'^{(t)}}{dt} = -\frac{n(n-1)}{4N}\left[\mu_n'^{(t)} - \mu_{n-1}'^{(t)}\right], \qquad (n = 1, 2, 3, \ldots). \qquad (15)$$
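The recursion (14) is easy to iterate directly, which gives a check on the closed forms. The sketch below is illustrative only: the function name and parameter values are hypothetical, and the zeroth moment is carried as 1 purely so that n = 1 needs no special case.

```python
def iterate_moments(p, n_individuals, generations, max_order=4):
    """Iterate recursion (14) for the moments about zero of orders 1..max_order."""
    four_n = 4.0 * n_individuals
    # moments[n] holds the nth moment; all mass starts at frequency p.
    moments = [p ** n for n in range(max_order + 1)]
    for _ in range(generations):
        new = moments[:]
        for n in range(1, max_order + 1):
            rate = n * (n - 1) / four_n
            new[n] = (1 - rate) * moments[n] + rate * moments[n - 1]
        moments = new
    return moments


p, n_ind, t = 0.3, 50, 120   # assumed example values
m = iterate_moments(p, n_ind, t)
variance = m[2] - m[1] ** 2
print(m[1], m[2], variance)
```

For n = 2 the recursion has the exact solution p - pq(1 - 1/2N)^t, so the variance rises toward pq at the rate 1/2N per generation.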


If the population starts from gene frequency p (0 < p < 1),
$\mu_n'^{(0)} = p^n$ and the nth moment is a solution of (15):

$$\mu_n'^{(t)} = p + \sum_{i=1}^{\infty}(2i+1)\,pq\,(-1)^i\,F(1-i,\,i+2,\,2,\,p)\,\frac{(n-1)\cdots(n-i)}{(n+1)\cdots(n+i)}\,e^{-[i(i+1)/4N]t}, \qquad (16)$$

where F(1 - i, i + 2, 2, p) is the hypergeometric function.
For finite n the series is finite.
He next derives the probability f(1, t) of the gene $A_1$
becoming fixed in the population by the tth generation. Note
that

$$f(1,t) = \lim_{n\to\infty}\sum_{x=0}^{1}x^n f(x,t) = \lim_{n\to\infty}\mu_n'^{(t)}.$$

He now has an infinite series

$$f(1,t) = p + \sum_{i=1}^{\infty}(2i+1)\,pq\,(-1)^i\,F(1-i,\,i+2,\,2,\,p)\,e^{-[i(i+1)/4N]t} \qquad (17)$$
whose convergence must be examined. At this point he introduces
the Gegenbauer polynomial $T^1_{i-1}(z)$, which is related to
the hypergeometric function by

$$T^1_{i-1}(z) = \frac{i(i+1)}{2}\,F\!\left(i+2,\,1-i,\,2,\,\frac{1-z}{2}\right).$$

Using this relation and putting $p = \dfrac{1-r}{2}$, where
(-1 < r < 1), he obtains:

$$f(1,t) = p + \sum_{i=1}^{\infty}\frac{(2i+1)}{2i(i+1)}\,(-1)^i\,(1-r^2)\,T^1_{i-1}(r)\,e^{-[i(i+1)/4N]t}. \qquad (18)$$

Using the recurrence relation,

$$(2i+1)(1-r^2)\,T^1_{i-1}(r) = i(i+1)\,P_{i-1}(r) - i(i+1)\,P_{i+1}(r), \qquad (19)$$

the above formula becomes:

$$f(1,t) = p + \sum_{i=1}^{\infty}\frac{(-1)^i}{2}\left[P_{i-1}(r) - P_{i+1}(r)\right]e^{-[i(i+1)/4N]t}, \qquad (20)$$

where $P_n(r)$ represents a Legendre polynomial. For t = 0, the
partial sum of the first n terms of equation (20) is

$$(-1)^{n-1}\,\frac{P_{n-1}(r) - P_n(r)}{2}.$$

To obtain the probability of gene $A_2$ being fixed, f(0, t)
is obtained by replacing p with q and r with -r.

In the notation of equation (11) he has

$$\begin{aligned}
f(1,t) = p &- 3pq(1-\lambda_1)^t - 5pq(p-q)(1-\lambda_2)^t \\
&- 7pq(-5pq+1)(1-\lambda_3)^t - 9pq(14pq^2 - 7pq + p - q)(1-\lambda_4)^t + \cdots \qquad (21)
\end{aligned}$$

and again f(0, t), the probability of complete loss, is found
by replacing p by q.

He now has the probability for the fixed classes and he makes
this statement:

$$f(1,t) + f(0,t) = 1 - \sum_{j=0}^{\infty}\left[P_{2j}(r) - P_{2j+2}(r)\right]e^{-[(2j+1)(2j+2)/4N]t}, \qquad (22)$$

which is 0 when t = 0 and tends to 1 when $t \to \infty$.
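The Legendre series for the fixed classes can be evaluated numerically with the standard three-term recurrence for $P_n$. The following is a sketch of the diffusion formula only, with hypothetical function names and an arbitrary truncation at 60 terms; convergence is slow for very small t, so the example uses moderately large t.

```python
import math


def legendre_values(r, max_degree):
    """Return [P_0(r), ..., P_max_degree(r)] via (k+1)P_{k+1} = (2k+1) r P_k - k P_{k-1}."""
    values = [1.0, r]
    for k in range(1, max_degree):
        values.append(((2 * k + 1) * r * values[k] - k * values[k - 1]) / (k + 1))
    return values[: max_degree + 1]


def fixation_probability(p, t, n_individuals, terms=60):
    """Series (20): probability that the gene is fixed by generation t."""
    r = 1.0 - 2.0 * p
    leg = legendre_values(r, terms + 1)
    total = p
    for i in range(1, terms + 1):
        decay = math.exp(-i * (i + 1) * t / (4.0 * n_individuals))
        total += 0.5 * (-1) ** i * (leg[i - 1] - leg[i + 1]) * decay
    return total


p, n_ind = 0.3, 50            # assumed example values
t = 8 * n_ind
fixed = fixation_probability(p, t, n_ind)
lost = fixation_probability(1 - p, t, n_ind)   # replace p by q (and r by -r)
unfixed = 1.0 - fixed - lost
print(fixed, lost, unfixed)
```

At t = 8N the unfixed remainder is dominated by the leading term of (22), which equals 6pq e^{-t/2N}.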

He then considers the probability distribution of unfixed
classes. The variance of the rate of change in gene frequency
due to random sampling of gametes is
$V_{\delta x} = \dfrac{x(1-x)}{2N}$. So if $\phi(x, t)$ is the
relative probability that the frequency of the gene in the
population will take any value between x and x + dx
(0 < x < 1) in the tth generation, $\phi(x, t)$ satisfies the
partial differential equation derived from equation (49).
(See Appendix.)

$$\frac{\partial\phi}{\partial t} = \frac{1}{4N}\frac{\partial^2}{\partial x^2}\left[x(1-x)\phi\right]. \qquad (23)$$


To solve this equation he uses
$\phi \propto X_i(x)\,e^{-\lambda_i t}$, or
$X_i(x)\,e^{-[i(i+1)/4N]t}$, and this gives the hypergeometric
equation

$$x(1-x)\frac{d^2X_i}{dx^2} + 2(1-2x)\frac{dX_i}{dx} - (1-i)(i+2)X_i = 0,$$

or, using x = (1 - z)/2 such that z = 1 - 2x, the Gegenbauer
equation

$$(z^2-1)\frac{d^2X_i}{dz^2} + 4z\frac{dX_i}{dz} - (i-1)(i+2)X_i = 0. \qquad (24)$$

Looking at equations (16) and (17), he derives the moment
formula

$$\mu_n'^{(t)} - 1^n f(1,t) = \int_0^1 x^n\phi(x,t)\,dx,$$

which suggests a solution of equation (23) of the form

$$\phi(x,t) = \sum_{i=1}^{\infty}C_iX_i(x)\,e^{-[i(i+1)/4N]t}. \qquad (25)$$

Comparing this with equation (24), it was found that a solution
for (24) is the Gegenbauer polynomial $X_i = T^1_{i-1}(z)$.
Thus

$$\phi(x,t) = \sum_{i=1}^{\infty}C_i\,T^1_{i-1}(z)\,e^{-[i(i+1)/4N]t}. \qquad (26)$$

He gives this method for solving for the $C_i$. Since the
initial gene frequency of the population is p,

$$\delta(x - p) = \sum_{i=1}^{\infty}C_i\,T^1_{i-1}(z),$$

where $\delta(x)$ represents the delta function. Multiplying
both sides by $(1 - z^2)\,T^1_{i-1}(z)$ and using the
orthogonal property,

$$\int_{-1}^{1}(1-z^2)\,T^1_m(z)\,T^1_{i-1}(z)\,dz = \delta_{m,i-1}\,\frac{2(i+1)i}{(2i+1)}, \qquad (27)$$

where $\delta_{m,i-1}$ is Kronecker's notation and m represents
zero or a positive integer; thus

$$2\left[1 - (1-2p)^2\right]T^1_{i-1}(1-2p) = C_i\,\frac{2(i+1)i}{(2i+1)},$$

$$C_i = 4pq\,\frac{(2i+1)}{i(i+1)}\,T^1_{i-1}(1-2p). \qquad (28)$$

Some of these values are given by $C_1 = 6pq$,
$C_2 = -10pq(p - q)$, $C_3 = 14pq(-5pq + 1)$.

The formal solution is

$$\phi(x,t) = \sum_{i=1}^{\infty}\frac{(2i+1)(1-r^2)}{i(i+1)}\,T^1_{i-1}(r)\,T^1_{i-1}(z)\,e^{-[i(i+1)/4N]t}, \qquad (29)$$

or in terms of the hypergeometric function,

$$\phi(x,t) = \sum_{i=1}^{\infty}pq\,i(i+1)(2i+1)\,F(1-i,\,i+2,\,2,\,p)\,F(1-i,\,i+2,\,2,\,x)\,e^{-[i(i+1)/4N]t}. \qquad (30)$$
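The series for the unfixed classes can be evaluated numerically by taking the Gegenbauer polynomial $T^1_{i-1}$ as the derivative of the Legendre polynomial $P_i$. The sketch below is an illustration under that assumption, with hypothetical function names, a truncation at 40 terms, and a simple midpoint rule for the integral; it then estimates the mean heterozygosity carried by the unfixed classes.

```python
import math


def legendre_derivatives(z, max_degree):
    """Return [P_0'(z), ..., P_max_degree'(z)] using P_{k+1}' = (k+1) P_k + z P_k'."""
    p_prev, p_curr = 1.0, z      # P_0, P_1
    derivs = [0.0, 1.0]          # P_0', P_1'
    for k in range(1, max_degree):
        p_next = ((2 * k + 1) * z * p_curr - k * p_prev) / (k + 1)
        derivs.append((k + 1) * p_curr + z * derivs[k])
        p_prev, p_curr = p_curr, p_next
    return derivs[: max_degree + 1]


def unfixed_density(x, p, t, n_individuals, terms=40):
    """Series (29) for phi(x, t), with T^1_{i-1} taken as P_i'."""
    r, z = 1.0 - 2.0 * p, 1.0 - 2.0 * x
    dr = legendre_derivatives(r, terms)
    dz = legendre_derivatives(z, terms)
    total = 0.0
    for i in range(1, terms + 1):
        coeff = (2 * i + 1) * (1 - r * r) / (i * (i + 1))
        total += coeff * dr[i] * dz[i] * math.exp(-i * (i + 1) * t / (4.0 * n_individuals))
    return total


def heterozygosity(p, t, n_individuals, grid=2000):
    """Midpoint-rule estimate of the integral of 2x(1-x) phi(x, t) over (0, 1)."""
    h = 1.0 / grid
    return sum(2 * x * (1 - x) * unfixed_density(x, p, t, n_individuals) * h
               for x in ((j + 0.5) * h for j in range(grid)))


p, n_ind, t = 0.3, 50, 100    # assumed example values, t = 2N
print(heterozygosity(p, t, n_ind), 2 * p * (1 - p) * math.exp(-t / (2 * n_ind)))
```

Because of the orthogonality of the polynomials, only the first term of the series contributes to this integral, so the estimate should agree with 2pq e^{-t/2N}.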


By noting that $\dfrac{dP_i(z)}{dz} = T^1_{i-1}(z)$ and
$P_n(1) = 1$, he gives the probability that both $A_1$ and
$A_2$ coexist in the tth generation:

$$\Omega_t = \int_0^1\phi(x,t)\,dx = \int_{-1}^{1}\phi(x,t)\,\frac{dz}{2} = \sum_{m=1}^{\infty}\frac{(4m-1)(1-r^2)}{(2m-1)2m}\,T^1_{2m-2}(r)\,e^{-[(2m-1)2m/4N]t}. \qquad (31)$$

For t > 0 the series is convergent, and as $t \to \infty$,
$\Omega_t$ becomes zero. He then gives the proof that when
t = 0, the series converges to 1. If $\Omega_{0,n}$ is the
partial sum of the first n terms, then by a recurrence relation

$$\frac{(4m-1)(1-r^2)\,T^1_{2m-2}(r)}{(2m-1)2m} = P_{2m-2}(r) - P_{2m}(r),$$

$$\Omega_{0,n} = 1 - P_{2n}(r).$$

Using

$$P_n(z) = \frac{1}{\pi}\int_0^{\pi}\left[z + \sqrt{z^2-1}\,\cos t\right]^n dt,$$

he shows that for |r| < 1, $P_{2n}(r) \to 0$ as
$n \to \infty$:

$$|P_{2n}(r)| \le \frac{1}{\pi}\int_0^{\pi}\left|r + \sqrt{r^2-1}\,\cos t\right|^{2n}dt = \frac{1}{\pi}\int_0^{\pi}\left[r^2 + (1-r^2)\cos^2 t\right]^n dt \to 0 \quad\text{as } n \to \infty.$$


Also

$$\Omega_t = \sum_{j=0}^{\infty}\left[P_{2j}(r) - P_{2j+2}(r)\right]e^{-[(2j+1)(2j+2)/4N]t}$$

from equation (31), which says
$f(1,t) + \Omega_t + f(0,t) = 1$ from equation (22). For t > 0
the series is seen to be convergent, and as $t \to \infty$,
$\Omega_t \to 0$, giving the asymptotic formula

$$\Omega_t \sim 6pq\,e^{-(1/2N)t}, \qquad (31.1)$$

and for t = 0, $\Omega_t$ converges to 1.

Finally, from equation (29) we have the probability of
heterozygosis,

$$H_t = \int_0^1 2x(1-x)\,\phi(x,t)\,dx = \sum_{i=1}^{\infty}pq\,\frac{(2i+1)}{i(i+1)}\,T^1_{i-1}(1-2p)\int_{-1}^{1}(1-z^2)\,T^1_{i-1}(z)\,e^{-[i(i+1)/4N]t}\,dz.$$

From equation (27) where m = 0, the integral above is zero
except when i = 1. Hence

$$H_t = pq\cdot\frac{3}{2}\cdot 1\cdot\frac{4}{3}\cdot e^{-(1/2N)t} = 2pq\,e^{-(1/2N)t} = H_0\,e^{-(1/2N)t}, \qquad (32)$$

and this shows that heterozygosis decreases at the rate 1/2N
per generation. This is the exact result of Wright and
Fisher's corrected result as given previously.




Kimura gives a short proof that this is valid. If p is the
frequency of $A_1$ and 2p(1 - p) is the frequency of
heterozygotes, then if $p + \delta p$ is the frequency of
$A_1$ after one generation, $1 - p - \delta p$ is the
frequency of $A_2$ in that generation. Since
$E(\delta p) = 0$, the expected value of the heterozygotes is

$$E\left[2(p+\delta p)(1-p-\delta p)\right] = 2p(1-p) - 2E(\delta p)^2 = 2p(1-p) - 2\,\frac{p(1-p)}{2N} = \left(1 - \frac{1}{2N}\right)2p(1-p),$$

as was to be shown.
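The one-generation result can be confirmed exactly by summing over the binomial distribution of the 2N sampled gametes. This is a small sketch with a hypothetical function name and assumed example values.

```python
from math import comb


def expected_heterozygosity_next(p, n_individuals):
    """E[2x'(1 - x')] where 2N x' is binomial(2N, p)."""
    two_n = 2 * n_individuals
    total = 0.0
    for k in range(two_n + 1):
        prob = comb(two_n, k) * p ** k * (1 - p) ** (two_n - k)
        x = k / two_n
        total += prob * 2 * x * (1 - x)
    return total


p, n_ind = 0.3, 25   # assumed example values
print(expected_heterozygosity_next(p, n_ind), (1 - 1 / (2 * n_ind)) * 2 * p * (1 - p))
```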

Again going back to the notation of equation (11) he writes

$$\Omega_t = 6pq\,e^{-(1/2N)t} + 14pq(-5pq+1)\,e^{-(6/2N)t} + O\!\left(e^{-(15/2N)t}\right), \qquad (33)$$

and also the variance of the distribution in the tth
generation is, from equations (21) and (25),

$$V_t = pq - pq\,e^{-(1/2N)t}. \qquad (34)$$

This says that the variance approaches its limiting value pq
at the rate 1/2N per generation.

Kimura (1955b) also considers the case where N is changing
gradually from generation to generation in a deterministic way
such that $N_t$ can be represented as a continuous function of
t.

In this case equation (31.1) becomes:

$$\Omega_t \sim 6pq\,e^{-\int_0^t dt/2N_t} \qquad (34.1)$$

and equation (32) becomes:

$$H_t = 2pq\,e^{-\int_0^t dt/2N_t}. \qquad (34.2)$$

Thus a necessary and sufficient condition for $H_t$ and
$\Omega_t$ to vanish in the limit as $t \to \infty$ in a
growing population is that the integral
$\int_0^{\infty}\dfrac{dt}{N_t}$ diverges, i.e., $N_t$ must be
at most of the order of t at the limit. If the population
increases more rapidly, heterozygosis cannot be eliminated
entirely.
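The divergence condition can be illustrated by evaluating the integral numerically for two growth laws. The sketch assumes hypothetical population sizes (linear growth 100 + t and exponential growth 100 e^{0.01 t}) and uses the trapezoidal rule; none of these choices come from the source.

```python
import math


def heterozygosity_changing_n(p, size_at, t_end, steps=100_000):
    """H_t = 2pq exp(-integral_0^t dt / (2 N_t)), integral by the trapezoidal rule."""
    dt = t_end / steps
    integral = 0.0
    for j in range(steps):
        a, b = j * dt, (j + 1) * dt
        integral += 0.5 * dt * (1 / (2 * size_at(a)) + 1 / (2 * size_at(b)))
    return 2 * p * (1 - p) * math.exp(-integral)


p, t_end = 0.5, 2000.0
h_linear = heterozygosity_changing_n(p, lambda t: 100.0 + t, t_end)
h_exponential = heterozygosity_changing_n(p, lambda t: 100.0 * math.exp(0.01 * t), t_end)
print(h_linear, h_exponential)
```

With linear growth the integral keeps increasing like a logarithm, so heterozygosity continues to fall; with exponential growth the integral converges (here to 1/2), and heterozygosity levels off near 2pq e^{-1/2}.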

On the other hand, if N changes stochastically around its mean
$\bar{N}$ with sufficiently small deviations compared with
$\bar{N}$ and if these deviations are mutually independent,
then N in equations (31.1) and (32) should be replaced by
$\bar{N} - \dfrac{V_N}{\bar{N}}$, where $V_N$ is the variance
of N.

Random  Genetic  Drift  Due  to  Random  Fluctuation 
of  Selection  Intensities 

Kimura (1954) considers a pair of alleles lacking dominance and
the process of change of their frequencies when their selection
coefficients fluctuate fortuitously from generation to
generation around a mean zero.

Consider a large random mating population with alleles $A_1$
and $A_2$, where the effect of random sampling of gametes is
negligible. If x is the relative frequency of the gene $A_1$
in the population and s is the selection coefficient of $A_1$,
then the rate of change of gene frequency due to selection is
approximately

$$\delta x = s\,x(1 - x)$$

per generation, when s is small. If there is random fluctuation
in the selection intensity, s and $\delta x$ are random
variables. Let the mean of s be $\bar{s}$ and its variance
$V_s$. Then the mean of $\delta x$ is

$$M_{\delta x} = \bar{s}\,x(1 - x)$$

and the variance

$$V_{\delta x} = V_s\,x^2(1 - x)^2.$$

Thus we would expect a certain irregularity in the process of
change in gene frequency from generation to generation.
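The irregular change described above can be sketched by iterating the selection equation with a freshly drawn selection coefficient each generation. This is an illustration only: the uniform distribution for s, the function name, and the parameter values are assumptions, chosen so that s has mean zero and variance spread**2 / 3.

```python
import random


def simulate_fluctuating_selection(x0, spread, generations, seed=None):
    """Gene-frequency path under delta x = s x (1 - x), s uniform on [-spread, spread]."""
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(generations):
        s = rng.uniform(-spread, spread)   # mean 0, variance spread**2 / 3
        x = x + s * x * (1 - x)
        path.append(x)
    return path


path = simulate_fluctuating_selection(0.5, 0.3, 1000, seed=1)
print(min(path), max(path), path[-1])
```

Because the step is proportional to x(1 - x), the path slows down as it nears 0 or 1, which is consistent with the asymptotic approach to fixation or loss described in the text.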

When the rate of change is small, this process may be treated
as a continuous Markov process. If x is the gene frequency at
the tth generation and the function $\phi(x, p; t)$ denotes the
density of the conditional probability that the frequency lies
between x and x + dx at the tth generation given that the
initial gene frequency was p at t = 0, we have

$$\frac{\partial\phi(x,p;t)}{\partial t} = \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[V_{\delta x}\,\phi(x,p;t)\right] - \frac{\partial}{\partial x}\left[M_{\delta x}\,\phi(x,p;t)\right]. \qquad (35)$$

This equation is known as "Kolmogorov's forward differential
equation" and also as the "Fokker-Planck equation". Wright
(1945) was the first to apply this equation to population
genetics. (See Appendix.)

The left-hand side of this equation represents the rate of
change of the relative probability of any class per generation
and can be written as the two terms on the right. Of the terms
on the right-hand side, the first is due to random fluctuation
and the second is due to the directed change.

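Making the substitutions for $M_{\delta x}$ and $V_{\delta x}$, equation (35) becomes:

$$\frac{\partial \phi}{\partial t} = \frac{V_s}{2}\frac{\partial^2}{\partial x^2}\left[x^2(1-x)^2\phi\right] - \bar{s}\,\frac{\partial}{\partial x}\left[x(1-x)\phi\right], \qquad (0 < x < 1). \tag{36}$$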


Let

$$\phi = 2\,x^{(1+s_1)/2\,-\,2}\,(1-x)^{(1-s_1)/2\,-\,2}\,e^{-\lambda t_1}\,U$$

and

$$x = \frac{1}{2}\left[1 + \tanh(\theta/2)\right],$$

where $t_1 = (tV_s)/2$ and $s_1 = (2\bar{s})/V_s$.


This reduces (36) to

$$\frac{d^2U}{d\theta^2} + \left[\lambda - \frac{1+s_1^2}{4} - \frac{s_1}{2}\tanh\!\left(\frac{\theta}{2}\right)\right]U = 0, \qquad (-\infty < \theta < \infty).$$


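Kimura (1955) gives the following two independent solutions, in which $F$ denotes the hypergeometric function:

$$U_{+} = \left(\frac{e^{\theta}}{1+e^{\theta}}\right)^{a}\left(\frac{1}{1+e^{\theta}}\right)^{b} F\!\left(a+b,\; a+b+1,\; 1+2a,\; \frac{e^{\theta}}{1+e^{\theta}}\right)$$

$$U_{-} = \left(\frac{e^{\theta}}{1+e^{\theta}}\right)^{-a}\left(\frac{1}{1+e^{\theta}}\right)^{b} F\!\left(-a+b,\; -a+b+1,\; 1-2a,\; \frac{e^{\theta}}{1+e^{\theta}}\right)$$

where

$$a = \sqrt{\left(\frac{1-s_1}{2}\right)^2 - \lambda} \quad\text{and}\quad b = \sqrt{\left(\frac{1+s_1}{2}\right)^2 - \lambda}\,.$$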


If the gene $A_1$ is randomly selected such that the mean value of its selection coefficient is zero when taken over very long periods of time, then

$$M_{\delta x} = 0,$$

$$V_{\delta x} = V_s\,x^2(1 - x)^2,$$

where $V_s$ is the variance of $s$; thus equation (35) reduces to

$$\frac{\partial \phi}{\partial t} = \frac{V_s}{2}\frac{\partial^2}{\partial x^2}\left[x^2(1-x)^2\phi\right]. \tag{37}$$


This equation has singularities at the boundaries, so that no arbitrary conditions can be imposed there, but Kimura shows that if an initial condition $\phi(x, p; 0)$ is given, a continuous stochastic process satisfying equation (37) can be uniquely determined.

Still considering the case of no dominance, Kimura (1952) makes the transformation

$$z = \log\left(\frac{x}{1 - x}\right).$$
Then the rate of change of the value of $z$ per generation becomes $\delta z = s$. If the gene frequency in the population is measured on the $z$ scale, it changes continuously from $-\infty$ to $\infty$ as $x$ changes from 0 to 1. Because $z$ after $t$ generations is its initial value plus the sum of $t$ successive increments $s$, the distribution of $z$ is approximately normal. The mean and variance of $\delta z$ are equal respectively to the mean $\bar{s}$ and variance $V_s$ of the selection coefficient $s$.
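The relation between the two scales can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: the helper names `logit` and `inv_logit` and the numerical values are chosen here for illustration and do not come from the source.

```python
import math


def logit(x):
    """z = log(x / (1 - x)), the scale used in the text."""
    return math.log(x / (1.0 - x))


def inv_logit(z):
    """Inverse transformation, written to avoid overflow for large |z|."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


x, s = 0.3, 0.01
x_next = x + s * x * (1.0 - x)        # one generation of selection on the x scale
delta_z = logit(x_next) - logit(x)    # the same change measured on the z scale
print(delta_z, "compared with s =", s)
```

For small s the change on the z scale is close to s itself, whatever the current frequency, which is what makes the z scale convenient.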

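Using the same transformation, Kimura was able to solve equation (37). Let

$$u = \frac{1}{2}\,e^{(V_s/8)t}\,x^{3/2}(1 - x)^{3/2}\,\phi$$

and

$$z = \log\left(\frac{x}{1 - x}\right).$$

The result is

$$\frac{\partial u}{\partial t} = \frac{V_s}{2}\frac{\partial^2 u}{\partial z^2}. \tag{38}$$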


This equation is also known as the heat conduction equation, and it is already established that there is a unique solution which is continuous for $-\infty < z < +\infty$ when $t \ge 0$ and reduces to $u(z, 0)$ when $t = 0$:

$$u(z,t) = \frac{1}{\sqrt{2\pi V_s t}}\int_{-\infty}^{\infty} e^{-(z-\rho)^2/(2V_s t)}\,u(\rho, 0)\,d\rho.$$

Then if the initial distribution of gene frequencies $\phi(x, p; 0)$ is given, after making substitutions we have the unique solution which satisfies (37) and is continuous between 0 and 1,

$$\phi(x,p;t) = \frac{e^{-(V_s/8)t}}{\sqrt{2\pi V_s t}\,[x(1-x)]^{3/2}} \int_0^1 \exp\left\{-\frac{\left[\log\dfrac{x(1-y)}{(1-x)y}\right]^2}{2V_s t}\right\}\sqrt{y(1-y)}\;\phi(y,0)\,dy. \tag{39}$$


On the other hand, if the initial condition is not a continuous distribution $\phi(x, p; 0)$ but a given gene frequency $x_0$, then the relative probability that the gene frequency in the $t$-th generation will be between $x$ and $x + dx$ is $\phi(x,p;t)\,dx$, where

$$\phi(x,p;t) = \frac{1}{\sqrt{2\pi V_s t}}\exp\left\{-\frac{V_s t}{8} - \frac{\left[\log\dfrac{x(1-x_0)}{(1-x)x_0}\right]^2}{2V_s t}\right\}\frac{[x_0(1-x_0)]^{1/2}}{[x(1-x)]^{3/2}}. \tag{40}$$


If $x_0 = 0.5$, the distribution curve is unimodal as long as the number of generations is less than $4/(3V_s)$, but becomes bimodal once it exceeds this value. (See Fig. 3.)
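Equation (40) and the change from a unimodal to a bimodal curve can be examined numerically. The sketch below is illustrative only: the function names, the integration settings, and the choice of integrating on the z = log(x/(1 - x)) scale are assumptions made here; the value 0.0483 for the variance is the one quoted in the caption of Fig. 3.

```python
import math


def phi(x, t, x0, v_s):
    """Density of equation (40) for a fixed initial gene frequency x0."""
    log_ratio = math.log(x * (1.0 - x0) / ((1.0 - x) * x0))
    exponent = -v_s * t / 8.0 - log_ratio ** 2 / (2.0 * v_s * t)
    scale = math.sqrt(x0 * (1.0 - x0)) / (x * (1.0 - x)) ** 1.5
    return scale * math.exp(exponent) / math.sqrt(2.0 * math.pi * v_s * t)


def total_probability(t, x0, v_s, z_max=20.0, steps=4000):
    """Integrate phi over (0, 1) by the trapezoidal rule on the z scale.

    With x = 1 / (1 + exp(-z)) we have dx = x (1 - x) dz, which removes
    the sharp peaks near the boundaries. z_max must be large compared
    with sqrt(v_s * t) for the tails to be negligible.
    """
    h = 2.0 * z_max / steps
    total = 0.0
    for i in range(steps + 1):
        z = -z_max + i * h
        x = 1.0 / (1.0 + math.exp(-z))
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * phi(x, t, x0, v_s) * x * (1.0 - x) * h
    return total


v_s, x0 = 0.0483, 0.5
print("threshold 4/(3 V_s):", 4.0 / (3.0 * v_s))
for t in (10, 100):
    print(t, total_probability(t, x0, v_s), phi(0.5, t, x0, v_s))
```

Before the threshold the density is highest at the initial frequency; well after it, the density at the center falls below the density nearer the boundaries, while the total probability between 0 and 1 stays equal to one.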

The mean of the distribution is always

$$\int_0^1 x\,\phi(x, t)\,dx = x_0, \tag{41}$$

but the variance

$$V_t = \int_0^1 (x - x_0)^2\,\phi(x, t)\,dx \tag{42}$$

increases in successive generations.

Fig. 3. Illustration of the process of change in the distribution of gene frequencies with random fluctuation in the selection intensities. It is assumed that there is no dominance, the initial gene frequency is 0.5, and the variance of the selection coefficient is 0.0483.

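It can be represented asymptotically for large $t$ as

$$V_t = x_0(1 - x_0) - \sqrt{\frac{\pi x_0(1-x_0)}{2V_s t}}\;e^{-V_s t/8} + O\!\left(\frac{e^{-V_s t/8}}{t\sqrt{t}}\right). \tag{43}$$

Thus as $t$ becomes very large, $V_t$ approaches its limiting value $x_0(1 - x_0)$, and the remaining difference decays at a rate very close to $V_s/8$ per generation.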
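The asymptotic formula (43) can be compared with a direct numerical evaluation of the variance. The sketch below is an illustration under stated assumptions: it uses the fact that the mean stays at x0, so that V_t = x0(1 - x0) - E[x(1 - x)], and it evaluates the expectation under density (40) on the z = log(x/(1 - x)) scale. The function names and numerical settings are chosen here and are not from the source.

```python
import math


def variance_exact(t, x0, v_s, z_max=60.0, steps=12000):
    """V_t = x0(1 - x0) - E[x(1 - x)], integrated by the trapezoidal rule."""
    tau = v_s * t
    z0 = math.log(x0 / (1.0 - x0))
    h = 2.0 * z_max / steps
    integral = 0.0
    for i in range(steps + 1):
        z = -z_max + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        gauss = math.exp(-(z - z0) ** 2 / (2.0 * tau))
        # On the z scale sqrt(x(1 - x)) equals 1 / (2 cosh(z / 2)).
        integral += weight * h * gauss / (2.0 * math.cosh(z / 2.0))
    factor = math.sqrt(x0 * (1.0 - x0)) * math.exp(-tau / 8.0)
    expected_xq = factor * integral / math.sqrt(2.0 * math.pi * tau)
    return x0 * (1.0 - x0) - expected_xq


def variance_asymptotic(t, x0, v_s):
    """First two terms of equation (43)."""
    tau = v_s * t
    decay = math.sqrt(math.pi * x0 * (1.0 - x0) / (2.0 * tau))
    return x0 * (1.0 - x0) - decay * math.exp(-tau / 8.0)


for t in (10, 100, 2000):
    print(t, variance_exact(t, 0.5, 0.05), variance_asymptotic(t, 0.5, 0.05))
```

The two values draw together only when V_s t is large, as expected of an asymptotic expression; for small t the exact integral is the one to trust.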

A highly complicated treatment of the terminal parts of the distribution is given in Kimura (1954), pages 286-289.

Comparison  of  Two  Methods 

Wright has repeatedly emphasized the evolutionary significance of random drift in a natural population which is subdivided into many partially isolated subgroups. His theory is accepted by many evolutionists. On the other hand, Fisher and Ford (1947) emphasized the prevalence of drift due to fluctuation of selection intensities and challenged the theory of Wright by denying any significance of random drift due to small population numbers in evolution. This led to experimental studies by members of the school of Fisher and Ford (Sheppard, 1951).

In spite of all of the experimental studies, no mathematical analysis was made. This prompted Kimura to make the studies described above. From his results he makes the following comparison, equating the rate of fixation due to random sampling of gametes with that due to fluctuation of selection intensities:

$$\frac{1}{2N} = \frac{V_s}{8}. \tag{44}$$

Although this is a rather restricted formula, it could be used to calculate $N$ or $V_s$ if one or the other is known.
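As a small worked illustration of equation (44), either quantity follows from the other by simple rearrangement. The function names below are hypothetical, and the example value 0.0483 is the variance quoted in the caption of Fig. 3.

```python
def equivalent_population_size(v_s):
    """Population number N for which 1/(2N) equals V_s/8, from equation (44)."""
    return 4.0 / v_s


def equivalent_selection_variance(n):
    """Variance V_s for which V_s/8 equals 1/(2N), from equation (44)."""
    return 4.0 / n


n_equiv = equivalent_population_size(0.0483)
print("N giving the same rate as V_s = 0.0483:", n_equiv)
```

Under this comparison a larger variance of the selection coefficient corresponds to a smaller equivalent population number.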

The effect produced by random fluctuation in natural selection is stated to be of relatively little importance for small populations. In large populations, however, it has a remarkable effect: in the case of no dominance, the distribution curve is modified markedly in the parts where the frequency of either allele is low.

Another difference between drift due to random sampling of gametes in small populations and drift due to fluctuation of selection intensities is that in the former case the gene in question may indeed be lost, while in the latter case the gene may reach an equilibrium near the fixation point, called quasi-fixation. This is the asymptotic case noted before.

Fixation  of  Mutant  Gene 

In large natural populations, gene mutations may be occurring in each generation. While most of the mutant genes are deleterious, some turn out to be advantageous. These advantageous mutant genes have a tendency to increase their frequencies in later generations, and thus have a chance of establishment even in large populations. Wright (1931, 1942) studied this problem and gave some solutions. Kimura (1957) and Robertson (1960) presented solutions under general conditions for the probability that a mutant gene would become fixed in a population.

Equation (49) takes the form

$$\frac{\partial u}{\partial t} = \frac{p(1-p)}{4N}\frac{\partial^2 u}{\partial p^2} + s\,p(1-p)\left[h + (1-2h)p\right]\frac{\partial u}{\partial p}, \tag{44.1}$$

where the selective advantage of the mutant homozygote is $s$ and that of the heterozygote is $sh$. The solution, $u(p, t)$, is the probability that the mutant gene reaches fixation by the $t$-th generation, given that its initial frequency is $p$. This probability is equivalent to that of equation (17).

Kimura (1957) defines the probability of ultimate fixation by

$$u(p) = \lim_{t \to \infty} u(p, t).$$

For the neutral mutant gene, $u(p) = p$. If $v$ is the initial number of mutant genes, $u(p) = \dfrac{v}{2N}$ and the probability of fixation per mutant gene is $\dfrac{1}{2N}$.

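For the general case Kimura (1957) sets $\dfrac{\partial u}{\partial t} = 0$ and obtains

$$u(p) = \frac{\displaystyle\int_0^p e^{-2Ns(2h-1)x(1-x) - 2Nsx}\,dx}{\displaystyle\int_0^1 e^{-2Ns(2h-1)x(1-x) - 2Nsx}\,dx}, \tag{44.2}$$

where $2h - 1$ is the measure of dominance.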

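The following are simplified expressions for the probability of ultimate fixation in three special cases of (44.2).

For additive ($h = 1/2$):

$$u(p) = \frac{\displaystyle\int_0^p e^{-2Nsx}\,dx}{\displaystyle\int_0^1 e^{-2Nsx}\,dx} = \frac{1 - e^{-2Nsp}}{1 - e^{-2Ns}}.$$

For recessive ($h = 0$):

$$u(p) = \frac{\displaystyle\int_0^p e^{-2Nsx^2}\,dx}{\displaystyle\int_0^1 e^{-2Nsx^2}\,dx}.$$

For dominance ($h = 1$):

$$u(p) = \frac{\displaystyle\int_0^p e^{2Nsx(x-2)}\,dx}{\displaystyle\int_0^1 e^{2Nsx(x-2)}\,dx}.$$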


By expanding each of these in a Taylor series and keeping only the first two terms, since the others will be very small, for small $Ns$ one obtains:

for additive:

$$u(p) = p + p(1 - p)Ns;$$

for recessive:

$$u(p) = p + \frac{2}{3}\,p(1 - p^2)Ns;$$

and for dominance:

$$u(p) = p + \frac{2}{3}\,p(p - 1)(p - 2)Ns.$$
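The quality of the two-term approximations can be judged by evaluating equation (44.2) numerically. The sketch below is illustrative only: the function names, the use of Simpson's rule, and the parameter values (N = 100, s = 0.001, p = 0.1, so that Ns = 0.1) are assumptions made here and are not taken from the source.

```python
import math


def fixation_probability(p, n, s, h, steps=2000):
    """Evaluate equation (44.2) by Simpson's rule (steps must be even)."""

    def integrand(x):
        return math.exp(-2.0 * n * s * (2.0 * h - 1.0) * x * (1.0 - x)
                        - 2.0 * n * s * x)

    def simpson(upper):
        width = upper / steps
        total = integrand(0.0) + integrand(upper)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * integrand(i * width)
        return total * width / 3.0

    return simpson(p) / simpson(1.0)


def small_ns_approximation(p, n, s, h):
    """Two-term approximations quoted in the text for h = 1/2, 0 and 1."""
    ns = n * s
    if h == 0.5:
        return p + p * (1.0 - p) * ns
    if h == 0.0:
        return p + (2.0 / 3.0) * p * (1.0 - p ** 2) * ns
    if h == 1.0:
        return p + (2.0 / 3.0) * p * (p - 1.0) * (p - 2.0) * ns
    raise ValueError("approximation is quoted only for h = 0, 1/2 and 1")


for h in (0.5, 0.0, 1.0):
    exact = fixation_probability(0.1, 100, 0.001, h)
    approx = small_ns_approximation(0.1, 100, 0.001, h)
    print(h, exact, approx)
```

With s = 0 the integrand is constant and the function returns p, the neutral result stated earlier.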


APPLICATIONS  AND  EXAMPLES 


Kerr and Wright (1954) made a three-part study of genetic drift, presented in three consecutive articles. In the first, a study of genetic drift due to inbreeding, they used the trait "forked". The other two experiments were with "Bar" and "spineless". It is stated that for the forked case the selection differential is much less than ten per cent, so that the results illustrate random drift from inbreeding in an almost pure form. Of 96 lines carried to fixation or to 16 generations, $f^{+}$ became fixed in 41 lines, $f$ (forked) in 29 lines, and 26 lines were still unfixed. The conclusion was that the amount of selection against forked is slight.

The Bar experiment was more extensive, and use was made of the Fokker-Planck equation. One hundred eight small populations were used. Little selective mortality was found, but there was severe selection against Bar from the low productivity of homozygous Bar females and Bar males. Starting from 50 per cent Bar genes in each case, the distribution soon reached approximate stability of form (in about four generations), as type came to be fixed at a rate of 22 per cent per generation and Bar at a rate of 0.7 per cent per generation. After generation 10, type had been fixed in 95 lines, Bar in three, and 10 were still unfixed. The form of the distribution agreed well with that expected from a population whose effective size is 72 per cent of its actual size and an empirically determined rate of change of the frequency ($q$) of Bar, $\delta q = -0.35\,q(1 - q)$.

Crow and Morton (1955) derived a formula for the variance of random drift of gene frequency and for effective population number. If $N_e$ is the effective size, then this variance is $q(1 - q)/2N_e$, where $q$ is the frequency of the allele under discussion. The formula derived is

$$V_{\delta q} = \frac{q(1 - q)}{4N}\left[1 - F' + (1 + F')\frac{V_k}{\mu_k}\right],$$

where $N$ is the total number of offspring, $\mu_k$ and $V_k$ are the mean and variance of the number of surviving offspring per parent, and $F'$ is Wright's coefficient of inbreeding. Also

$$N_e = \frac{2N}{1 - F' + (1 + F')\dfrac{V_k}{\mu_k}}.$$

They also indicate that $V_k/\mu_k$ is a measure of the degree of departure from idealized conditions and thus propose that this ratio be used as an index of variability in progeny number. The authors then give an account of an experiment with Drosophila in which they applied these methods.
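The effect of variability in progeny number on effective size can be seen from a short calculation. The sketch below assumes the expression for the effective number reads as 2N divided by 1 - F' + (1 + F')V_k/mu_k, and that the drift variance is q(1 - q)/2N_e as stated in the text; the function and argument names are hypothetical.

```python
def effective_size(n_offspring, f, v_k, mu_k):
    """Effective number N_e = 2N / [1 - F' + (1 + F') V_k / mu_k]."""
    return 2.0 * n_offspring / (1.0 - f + (1.0 + f) * v_k / mu_k)


def drift_variance(q, n_offspring, f, v_k, mu_k):
    """Variance of random drift of gene frequency, q(1 - q) / (2 N_e)."""
    return q * (1.0 - q) / (2.0 * effective_size(n_offspring, f, v_k, mu_k))


# With no inbreeding and V_k equal to mu_k, the effective size equals N;
# a larger variance in progeny number reduces it.
print(effective_size(100, 0.0, 2.0, 2.0))
print(effective_size(100, 0.0, 4.0, 2.0))
print(drift_variance(0.5, 100, 0.0, 2.0, 2.0))
```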

In a small population experiment, Merrell (1953) followed gene frequency changes in sex-linked recessive genes of Drosophila melanogaster. Population sizes were from 10 to 100. The percentage of wild type flies rose rapidly and remained above 90 per cent, while some strains decreased in frequency. Large fluctuations occurred due to genetic drift, in some cases leading to loss of the recessive gene. The results were interpreted as due to the combined effects of natural selection and genetic drift.

Spencer (1947) analyzed a sample of 110 wild flies showing a frequency of 10 per cent for the gene "stubble bristles". In a sample of identical size, collected at a point almost one-fourth mile distant from the first collection area and two years later, he found a gene frequency of seven per cent for "stubble" bristle. The genes "brick" eye color and "dubonnet" eye color were also recovered more than once in both samples. The concentration of these genes in the population is explained as caused by genetic drift brought about by seasonal fluctuations of population size.

An example of the difference between large and small populations is shown in Fig. 4.

Fig. 4. Difference in gene frequency change when comparing large (4000) and small (20) samples of Drosophila, where each sample is divided into 10 groups. Panels: large populations; small populations. (Dobzhansky, 1957.)

Computer Simulation

For the case of a = 0 given by Kimura (1955), Barker and Butcher (1966) developed a Monte Carlo computer program to investigate quasi-fixation of genes due to random fluctuation of selection intensities. They start with a gene frequency of 0.9 for the desirable allele and a constant mean selection coefficient equal to 0.01. They performed 10 simultaneous experiments with the variance of the selection coefficient, $V_s$, ranging from 0.02 to 0.2. In terms of the probability of quasi-loss of the desirable allele, the results confirm the theoretical expectation of Kimura (1962). The number of generations to final stability of quasi-loss tended to increase as $2\bar{s}/V_s$ increased and could be expected to be at least 1000 for $0.5 < 2\bar{s}/V_s < 1.0$.
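A Monte Carlo experiment of this general kind can be sketched in a few lines. The code below is not the program of Barker and Butcher, whose details are not given here. It is a minimal illustration under stated assumptions: the gene frequency is followed on the z = log(x/(1 - x)) scale with the recurrence z' = z + s quoted earlier, s is drawn from a normal distribution, and a population is counted as quasi-fixed or quasi-lost when its frequency is within 0.01 of 1 or 0. All names and settings are hypothetical.

```python
import math
import random


def inv_logit(z):
    """Gene frequency x corresponding to z, written to avoid overflow."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_final_frequencies(x0, s_mean, v_s, generations, replicates, seed=0):
    """Follow each replicate population for the given number of generations."""
    rng = random.Random(seed)
    sd = math.sqrt(v_s)
    z0 = math.log(x0 / (1.0 - x0))
    finals = []
    for _ in range(replicates):
        z = z0
        for _ in range(generations):
            z += rng.gauss(s_mean, sd)   # assumed recurrence: delta z = s
        finals.append(inv_logit(z))
    return finals


def quasi_fixation_summary(finals, margin=0.01):
    """Fractions of replicates that are quasi-lost and quasi-fixed."""
    lost = sum(x < margin for x in finals) / len(finals)
    fixed = sum(x > 1.0 - margin for x in finals) / len(finals)
    return lost, fixed


finals = simulate_final_frequencies(0.9, 0.01, 0.05, generations=2000, replicates=200)
lost, fixed = quasi_fixation_summary(finals)
print("quasi-lost:", lost, "quasi-fixed:", fixed)
```

No replicate ever reaches frequency 0 or 1 exactly in this scheme, which is the sense in which the outcome is quasi-fixation rather than fixation.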

SUMMARY 

Using  a  differential  equation,  Fisher  (1922)  was  the  first 
to  give  a  mathematical  treatment  for  the  problem  of  random 
genetic  drift  in  finite  populations  due  to  random  sampling  of 
gametes.   His  result  for  the  rate  of  decay  of  unfixed  classes 
was  not  correct,  being  only  half  its  true  value.   Wright  (1931), 
using  path  coefficients  and  an  integral  equation,  supplied  the 
first  correct  solution  for  the  state  of  steady  decay. 

In  these  results,  Fisher  and  Wright  both  assumed  that  a 
steady  state  of  decay  had  been  attained,  but  nothing  was  known 
about  how  the  process  leads  to  the  state  of  steady  decay. 
Kimura (1955), by calculating the moments of the distribution with the help of the Fokker-Planck equation, obtained a solution in the form of an infinite series under the continuous model, showing that the process approaches the state of steady decay asymptotically.

When there is drift due to random fluctuations in selection intensity and to random sampling, the process of change in gene frequency in a population can be represented by a stochastic process. Kimura (1954) presented an analysis of this process for the case of no dominance. In the case of random drift in small populations it was found that complete fixation or loss of an allele would be realized. Complete fixation or loss may not be realized in the case of drift due to fluctuation of selection intensities. It is shown that for large populations, if a sufficient number of generations is allowed, a situation will be realized in which the allele is either almost fixed in the population or almost lost from it. The rate of decay per generation is given as $V_s/8$, where $V_s$ is the variance of the selection coefficient.

Kimura (1954) also made a comparison of drift due to fluctuation of selection intensities with drift due to random sampling. He gives a rather restricted formula by equating the two rates of fixation:

$$\frac{1}{2N} = \frac{V_s}{8}.$$

There are several experimental studies on this subject, some of which are listed, dealing with experimental animals. It is noted here that there have been studies of genetic drift in human populations, especially those of Glass (1952, 1954) and Lasker (1952, 1964).

A derivation of the Fokker-Planck equation and its use in deriving the distribution function given by Wright is given in the Appendix.
APPENDIX

Derivation of Fokker-Planck Equation and Its
Use in Deriving the Distribution
Equation Given by Wright

Using a method given by Kimura (1955c), let $\phi(x, t)$ represent the curve for the probability distribution of gene frequencies at time $t$. The distribution is approximated with histograms, each column having width $h$, as shown in Fig. 5. The gene frequency of each class is represented by the middle point of the column. Consider the class with gene frequency $x$. For sufficiently small $h$, the area of the column, $\phi(x, t)h$, gives the probability that the population has gene frequency $x \pm \tfrac{1}{2}h$.

By considering a small change in time $\Delta t$, it is sufficient to consider the movement of the gene frequency to its adjacent classes. This population, with gene frequency $x$, will move to another class due to systematic as well as random changes.

Let $m(x)\,\Delta t$ be the probability that the population moves to the higher class $(x + h)$ by systematic pressure. Let $v(x)\,\Delta t$ be the probability that it moves outside the class by random fluctuation, half of the time to the left class $(x - h)$ and the other half to the right class $(x + h)$. Movement to other than adjacent classes is neglected because its probability is very small.

Thus the probability that the population will have gene frequency $x \pm \tfrac{1}{2}h$ after $\Delta t$ is obtained by considering the exchange of gene frequencies between these adjacent classes.
$$\begin{aligned}
\phi(x, t + \Delta t)h = {} & \phi(x,t)h - \left[v(x) + m(x)\right]\Delta t\,\phi(x,t)h \\
& + \tfrac{1}{2}v(x - h)\,\Delta t\,\phi(x - h, t)h + \tfrac{1}{2}v(x + h)\,\Delta t\,\phi(x + h, t)h \\
& + m(x - h)\,\Delta t\,\phi(x - h, t)h.
\end{aligned} \tag{45}$$


The second term on the right is the amount of loss due to movement to other classes; the third term is the contribution from the left class and the fourth the contribution from the right class, both due to random change; and the last term is the contribution from the left class due to systematic change.

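Let $\sigma^2(x, t)\,\Delta t$ be the variance of the change in $x$ per $\Delta t$ due to random change,

$$\sigma^2(x,t)\,\Delta t = h^2\left[\frac{v(x)}{2}\right]\Delta t + (-h)^2\left[\frac{v(x)}{2}\right]\Delta t,$$

so

$$\sigma^2(x,t) = h^2\,v(x). \tag{46}$$

Let $M(x, t)\,\Delta t$ be the mean change in $x$ per $\Delta t$,

$$M(x, t)\,\Delta t = h\,m(x)\,\Delta t,$$

so

$$M(x, t) = m(x)\,h. \tag{47}$$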


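Now substitute (46) and (47) into (45) and divide both sides by $\Delta t \cdot h$. Then on rearrangement

$$\begin{aligned}
\frac{\phi(x, t + \Delta t) - \phi(x,t)}{\Delta t} = {} & \frac{1}{2h}\left\{\frac{\sigma^2(x+h,t)\phi(x+h,t) - \sigma^2(x,t)\phi(x,t)}{h} - \frac{\sigma^2(x,t)\phi(x,t) - \sigma^2(x-h,t)\phi(x-h,t)}{h}\right\} \\
& - \frac{M(x,t)\phi(x,t) - M(x-h,t)\phi(x-h,t)}{h}.
\end{aligned} \tag{48}$$

Fig. 5. Histogram approximation to the curve $y = \phi(x, t)$, with adjacent classes at $x - h$, $x$, and $x + h$.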

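Taking the limit $\Delta t \to 0$, $h \to 0$ gives:

$$\frac{\partial \phi(x,t)}{\partial t} = \frac{1}{2}\frac{\partial^2}{\partial x^2}\left[\sigma^2(x,t)\,\phi(x,t)\right] - \frac{\partial}{\partial x}\left[M(x,t)\,\phi(x,t)\right]. \tag{49}$$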


This is known as the Fokker-Planck equation and also as the Kolmogorov forward equation.
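The bookkeeping of equation (45) can be run directly as a numerical scheme, which may help in seeing what each term does. The sketch below is illustrative and rests on assumptions made here: the class probabilities phi(x, t)h are stored in a list, the random change is taken to be pure sampling drift with variance x(1 - x)/(2N), there is no systematic pressure, and the number of classes, the population number, and the time step are arbitrary example values.

```python
def histogram_step(prob, v, m, dt):
    """Apply equation (45) once; prob[i] is the class probability phi(x_i, t) * h."""
    last = len(prob) - 1
    new = []
    for i, p_i in enumerate(prob):
        value = p_i - (v[i] + m[i]) * dt * p_i       # loss to other classes
        if i > 0:                                     # gains from the left class
            value += (0.5 * v[i - 1] + m[i - 1]) * dt * prob[i - 1]
        if i < last:                                  # gain from the right class
            value += 0.5 * v[i + 1] * dt * prob[i + 1]
        new.append(value)
    return new


# Assumed example: sigma^2 = x(1 - x) / (2N) and M = 0.
classes, n_pop, dt = 20, 10, 0.05
h = 1.0 / classes
xs = [i / classes for i in range(classes + 1)]
v = [x * (1.0 - x) / (2.0 * n_pop) / h ** 2 for x in xs]   # from (46): v = sigma^2 / h^2
m = [0.0] * len(xs)                                         # from (47): m = M / h

prob = [0.0] * len(xs)
prob[classes // 2] = 1.0            # every population starts at x = 0.5
for _ in range(200):                # 200 steps of length dt, i.e. 10 generations
    prob = histogram_step(prob, v, m, dt)

total = sum(prob)
mean = sum(x * p for x, p in zip(xs, prob))
print(total, mean, prob[0] + prob[-1])
```

In this example the total probability and the mean gene frequency are preserved from step to step, while probability accumulates in the two terminal classes. The time step must be small enough that v(x) times dt does not exceed one, otherwise class probabilities can become negative.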

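Rewriting (49) for a stationary distribution $\phi(q)$, where $\Delta q$ represents the tendency toward a stable equilibrium point due to systematic pressure, $\delta q$ is the tendency to drift away from that point due to random deviation, with variance $\sigma^2_{\delta q}$, and where the mean change is taken as zero:

$$\frac{1}{2}\frac{d^2}{dq^2}\left[\sigma^2_{\delta q}\,\phi(q)\right] - \frac{d}{dq}\left[\Delta q\,\phi(q)\right] = 0. \tag{50}$$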


Then, according to Li (1955), integrating (50) gives

$$\Delta q\,\phi(q) = \frac{1}{2}\frac{d}{dq}\left[\sigma^2_{\delta q}\,\phi(q)\right] + \text{constant}. \tag{51}$$

At this point it is seen that the left-hand member $\Delta q \cdot \phi(q)$ represents the fraction of the distribution that tends to be carried past a given value of $q$ by the systematic pressure $\Delta q$ in each generation. Since the distribution is stationary, the right-hand side

$$\frac{1}{2}\frac{d}{dq}\left[\sigma^2_{\delta q}\,\phi(q)\right] = \frac{1}{4N}\frac{d}{dq}\left[q(1 - q)\,\phi(q)\right]$$

must be the fraction of the distribution which tends to be scattered away in the opposite direction by random deviations in each generation.
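Rewriting (51), with the constant taken as zero,

$$\frac{\Delta q}{\sigma^2_{\delta q}}\left[\sigma^2_{\delta q}\,\phi(q)\right] = \frac{1}{2}\frac{d}{dq}\left[\sigma^2_{\delta q}\,\phi(q)\right],$$

$$\frac{2\,\Delta q}{\sigma^2_{\delta q}} = \frac{\dfrac{d}{dq}\left[\sigma^2_{\delta q}\,\phi(q)\right]}{\sigma^2_{\delta q}\,\phi(q)} = \frac{d}{dq}\log\left[\sigma^2_{\delta q}\,\phi(q)\right];$$

then integrating again,

$$2\int \frac{\Delta q}{\sigma^2_{\delta q}}\,dq = \log\left[\sigma^2_{\delta q}\,\phi(q)\right] + \text{constant}.$$

Therefore

$$\phi(q) = \frac{C}{\sigma^2_{\delta q}}\exp\left[2\int \frac{\Delta q}{\sigma^2_{\delta q}}\,dq\right], \tag{52}$$

where $C$ is a constant such that

$$\int_0^1 \phi(q)\,dq = 1.$$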
This is the general formula (52) for the distribution of a gene frequency when a steady state (under the joint actions of $\Delta q$ and $\delta q$) has been reached, as given by Wright (1937, 1938a, 1938b, 1942a).
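Formula (52) can be evaluated numerically for any given systematic pressure and sampling variance. The sketch below is an illustration only. As an assumed example it takes reversible mutation as the systematic pressure, with hypothetical rates `mu_forward` and `mu_back`, and the sampling variance q(1 - q)/(2N); for that case the integral in (52) can also be done by hand, giving a density proportional to q^(4N mu_back - 1) (1 - q)^(4N mu_forward - 1), which serves as a check. The normalizing constant C is left out.

```python
import math


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; steps must be even."""
    width = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * width)
    return total * width / 3.0


def stationary_density(q, delta_q, sigma2, q_ref=0.5):
    """Unnormalized phi(q) from formula (52), integrating from q_ref to q."""
    integral = simpson(lambda y: delta_q(y) / sigma2(y), q_ref, q)
    return math.exp(2.0 * integral) / sigma2(q)


# Assumed example values, not taken from the source.
n_pop, mu_forward, mu_back = 100, 0.01, 0.01


def delta_q(q):
    """Systematic change per generation under reversible mutation."""
    return -mu_forward * q + mu_back * (1.0 - q)


def sigma2(q):
    """Variance of the random deviation, q(1 - q) / (2N)."""
    return q * (1.0 - q) / (2.0 * n_pop)


ratio = stationary_density(0.3, delta_q, sigma2) / stationary_density(0.5, delta_q, sigma2)
expected = (0.3 * 0.7) ** 3 / (0.5 * 0.5) ** 3   # from q^(4Nv-1) (1-q)^(4Nu-1) with 4Nu = 4Nv = 4
print(ratio, expected)
```

Only ratios of the density are compared here, so the choice of reference point and the omitted constant do not matter.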
Genetic drift due to random sampling of gametes and due to fluctuation of selection intensities is presented. Both are considered as stochastic processes and are treated as such. Fisher, using a differential equation, was the first to give a mathematical treatment for the first case in finite populations. His result for the rate of decay of variance was not correct, being only half large enough. Wright, using path coefficients and an integral equation, gave the correct solution as $1/(2N)$ per generation. This result on steady decay was later extended by Kimura. By using the Fokker-Planck equation and computing the moments of the distribution, he confirmed Wright's results and also obtained a solution in the form of an infinite series under the continuous model, showing that the process approaches the state of steady decay asymptotically. It is found that, given enough generations, the gene in question will be either completely lost or completely fixed in the population.

For the case of drift due to fluctuation of selection intensities, it is found that the gene frequency again approaches fixation asymptotically, but the gene is not necessarily completely lost or fixed at gene frequency 1.0. It is found that the rate of decay is about $V_s/8$, where $V_s$ is the variance of the selection intensities.

A comparison is made of these two types of genetic drift and examples are given.

A derivation of the Fokker-Planck equation and its use to derive the distribution function given by Wright is given in the Appendix.


